Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis

Authors

  • Shinnosuke Takamichi
  • Tomoki Toda
  • Alan W. Black
  • Satoshi Nakamura
Abstract

This paper presents a novel training algorithm for Hidden Markov Model (HMM)-based speech synthesis. One of the biggest issues causing significant quality degradation in synthetic speech is the over-smoothing effect often observed in generated speech parameter trajectories. We recently found that the Modulation Spectrum (MS) of the generated speech parameters is sensitively correlated with the over-smoothing effect, and proposed a parameter generation algorithm that considers the MS. That algorithm effectively alleviates the over-smoothing effect, but it loses the computationally efficient generation processing of the conventional generation algorithm. In this paper, the MS is integrated into the training stage instead of the parameter generation stage, in a manner similar to our previous work on Gaussian Mixture Model (GMM)-based spectral parameter trajectory conversion. The trajectory HMM is trained with a novel objective function consisting of both the conventional trajectory HMM likelihood and a newly implemented MS likelihood. This training framework is further extended to the F0 component. The experimental results demonstrate that the proposed algorithm improves synthetic speech quality while preserving computationally efficient generation processing.
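
The idea of an objective that combines a trajectory likelihood with an MS likelihood can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the MS is approximated as the log power spectrum of each parameter dimension over time, the MS likelihood is a diagonal Gaussian, and the names `modulation_spectrum`, `gaussian_loglik`, `combined_objective`, `weight`, and `n_fft` are hypothetical.

```python
import numpy as np


def modulation_spectrum(traj, n_fft=64):
    """Log power spectrum of each parameter dimension along time.

    traj: array of shape (T, D), one row per frame.
    Returns an array of shape (n_fft // 2 + 1, D).
    Assumption: this log-power definition is illustrative only.
    """
    traj = np.asarray(traj, dtype=float)
    # rfft zero-pads (or truncates) the time axis to n_fft frames.
    spec = np.fft.rfft(traj, n=n_fft, axis=0)
    power = np.abs(spec) ** 2
    return np.log(power + 1e-10)  # small floor avoids log(0)


def gaussian_loglik(x, mean, var):
    """Diagonal-Gaussian log-likelihood summed over all elements."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def combined_objective(traj_loglik, traj, ms_mean, ms_var, weight=1.0, n_fft=64):
    """Trajectory log-likelihood plus a weighted MS log-likelihood.

    traj_loglik is assumed to be computed elsewhere (e.g. by a trajectory
    HMM); weight is a hypothetical balancing factor.
    """
    ms = modulation_spectrum(traj, n_fft)
    return traj_loglik + weight * gaussian_loglik(ms, ms_mean, ms_var)


# Over-smoothing shows up as reduced power at high modulation frequencies.
rng = np.random.default_rng(0)
raw = rng.standard_normal((64, 2))
kernel = np.ones(5) / 5.0  # simple moving average as a stand-in for smoothing
smooth = np.column_stack(
    [np.convolve(raw[:, d], kernel, mode="same") for d in range(raw.shape[1])]
)
ms_raw = modulation_spectrum(raw)
ms_smooth = modulation_spectrum(smooth)
```

In this toy setup, the smoothed trajectory has a lower average MS in the upper modulation-frequency bins than the raw one, which is the kind of discrepancy an MS likelihood term would penalize during training.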

Similar resources

An introduction of trajectory model into HMM-based speech synthesis

In the synthesis part of a hidden Markov model (HMM) based speech synthesis system which we have proposed, a speech parameter vector sequence is generated from a sentence HMM corresponding to an arbitrarily given text by using a speech parameter generation algorithm. However, there is an inconsistency: although the speech parameter vector sequence is generated under the constraints between stat...

Speech Parameter Sequence Modeling with Latent Trajectory Hidden Markov Model

The weakness of hidden Markov models (HMMs) is that they have difficulty in modeling and capturing the local dynamics of feature sequences due to the piecewise stationarity assumption and the conditional independence assumption on feature sequences. Traditionally, in speech recognition systems, this limitation has been circumvented by appending dynamic (delta and delta-delta) components to the ...

Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences

In the present paper, a trajectory model, derived from the hidden Markov model (HMM) by imposing explicit relationships between static and dynamic feature vector sequences, is developed and evaluated. The derived model, named trajectory HMM, can alleviate some limitations of the standard HMM, which are i) piece-wise constant statistics within a state and ii) conditional independence assumption ...

Speech trajectory discrimination using the minimum classification error learning

In this paper, we extend the maximum likelihood (ML) training algorithm to the minimum classification error (MCE) training algorithm for discriminatively estimating the state-dependent polynomial coefficients in the stochastic trajectory model or the trended hidden Markov model (HMM) originally proposed in [2]. The main motivation of this extension is the new model space for smoothness-constrai...

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...

Publication date: 2015